Recurrent unit augmented memory network for video summarisation

Authors

Abstract

Video summarisation can relieve the pressure on video storage, transmission, archiving, and retrieval caused by the explosive growth of online videos in recent years. Most existing supervised methods use convolutional neural networks (CNNs) or recurrent neural networks (RNNs) to model temporal dependencies between frames and shots. CNNs mainly focus on local information, while RNNs lose long-term information when the input sequence is long; both have a limited ability to obtain long-range memory of a video. Therefore, a recurrent unit augmented memory network (RUAMN) for video summarisation is proposed, which effectively utilises the feature-extraction ability of the end-to-end memory network (MemN2N) and addresses the problem that MemN2N is insensitive to temporal information. At the same time, the proposed RUAMN enhances the memory-update process over multiple computational steps (hops) and finally generates a meaningful summarisation result. Specifically, RUAMN is composed of a feature extraction module, a global-and-local sampling module, a memory network module and an output module. The feature extraction module uses a bidirectional GRU to encode forward and backward information for each frame. The sampling module then performs global and local sampling respectively, splitting the sequence into several shorter sequences, so that the memory modules can capture the fine-grained relationships between features more effectively. The memory network module extracts long-range dependencies from the feature sequence, and frame-level importance scores are predicted by the output module. Extensive experiments on two benchmark datasets, that is, TVSum and SumMe, demonstrate the superiority of our method over state-of-the-art methods.
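The abstract refers to a MemN2N-style memory read that is refined over multiple hops. The paper's exact formulation is not given here, so the following is only a minimal illustrative sketch, assuming the standard end-to-end memory network scheme: each hop computes attention over the stored frame features and adds the retrieved vector back into the query, letting later hops gather long-range context that a single pass would miss.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(memory, query, hops=3):
    """MemN2N-style multi-hop memory read (illustrative sketch only).

    memory: (T, d) array of frame features stored as T memory slots.
    query:  (d,) initial query vector.
    Each hop attends over the memory slots and adds the read vector
    to the query, so subsequent hops refine what context is retrieved.
    """
    u = query
    for _ in range(hops):
        p = softmax(memory @ u)   # attention weights over T slots
        o = p @ memory            # weighted read vector, shape (d,)
        u = u + o                 # update query for the next hop
    return u
```

The hop count and the additive query update are assumptions based on the generic MemN2N design; RUAMN additionally interleaves recurrent units and global-and-local sampling, which this sketch omits.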


Similar articles

Multi-Source Video Summarisation

Many visual surveillance tasks, e.g. video summarisation, are conventionally accomplished through analysing imagery-based features. Relying solely on visual cues for public surveillance video understanding is unreliable, since visual observations obtained from public-space CCTV video data are often not sufficiently trustworthy and events of interest can be subtle. On the other hand, non-visual dat...


MAVOT: Memory-Augmented Video Object Tracking

We introduce a one-shot learning approach for video object tracking. The proposed algorithm requires seeing the object to be tracked only once, and employs an external memory to store and remember the evolving features of the foreground object as well as backgrounds over time during tracking. With the relevant memory retrieved and updated in each tracking, our tracking model is capable of maint...


A shortest path representation for video summarisation

A novel approach is presented to select multiple key frames within an isolated video shot where there is camera motion causing significant scene change. This is achieved by determining the dominant motion between frame pairs whose similarities are represented using a directed weighted graph. The shortest path in the graph, found using the search algorithm, designates the key frames. The overall...


Augmented Transition Network as a Semantic Model for Video Data

An abstract semantic model called the augmented transition network (ATN), which can model video data and user interactions, is proposed in this paper. An ATN and its subnetworks can model video data based on different granularities such as scenes, shots and key frames. Multimedia input strings are used as inputs for ATNs. Key frame selection is based on temporal and spatial relations of semanti...


Comparison of multi-episode video summarisation algorithms

This paper presents a comparison of some methodologies for the automatic construction of video summaries. The work is based on the Simulated User Principle to evaluate the quality of a video summary in a way which is automatic, yet related to the user's perception. The method is studied for the case of multi-episode video, where we not only describe what is important in a video, but rather what ...



Journal

Journal title: IET Computer Vision

سال: 2023

ISSN: 1751-9632, 1751-9640

DOI: https://doi.org/10.1049/cvi2.12194